Transfer Learning via Inter-Task Mappings for Temporal Difference Learning

Authors

  • Matthew E. Taylor
  • Peter Stone
  • Yaxin Liu
Abstract

Temporal difference (TD) learning (Sutton and Barto, 1998) has become a popular reinforcement learning technique in recent years. TD methods, relying on function approximators to generalize learning to novel situations, have had some experimental successes and have been shown to exhibit some desirable properties in theory, but the most basic algorithms have often been found slow in practice. This empirical result has motivated the development of many methods that speed up reinforcement learning by modifying a task for the learner or helping the learner better generalize to novel situations. This article focuses on generalizing across tasks, thereby speeding up learning, via a novel form of transfer using handcoded task relationships. We compare learning on a complex task with three function approximators, a cerebellar model arithmetic computer (CMAC), an artificial neural network (ANN), and a radial basis function (RBF), and empirically demonstrate that directly transferring the action-value function can lead to a dramatic speedup in learning with all three. Using transfer via inter-task mapping (TVITM), agents are able to learn one task and then markedly reduce the time it takes to learn a more complex task. Our algorithms are fully implemented and tested in the RoboCup soccer Keepaway domain. This article contains and extends material published in two conference papers (Taylor and Stone, 2005; Taylor et al., 2005).
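As a rough illustration of what transferring an action-value function through a handcoded inter-task mapping could look like, here is a minimal Python sketch. It is not the authors' implementation: it assumes a toy linear action-value function with one weight per (action, state variable) pair, and the names `transfer_action_values`, `action_map`, and `state_var_map` are hypothetical. Each target action and state variable is mapped by hand to a source counterpart, and the learned source weights are copied across to initialize the target learner before TD learning continues on the target task.

```python
import numpy as np

def transfer_action_values(source_weights, action_map, state_var_map):
    """Build initial target-task weights from learned source-task weights.

    source_weights: array of shape (n_source_actions, n_source_vars).
    action_map:     dict, target action index -> source action index (handcoded).
    state_var_map:  dict, target state-variable index -> source index (handcoded).
    """
    n_target_actions = len(action_map)
    n_target_vars = len(state_var_map)
    target_weights = np.zeros((n_target_actions, n_target_vars))
    for a_t in range(n_target_actions):
        a_s = action_map[a_t]
        for x_t in range(n_target_vars):
            x_s = state_var_map[x_t]
            # Copy the weight of the most similar source (action, variable) pair.
            target_weights[a_t, x_t] = source_weights[a_s, x_s]
    return target_weights

def q_value(weights, state, action):
    """Linear action-value estimate Q(s, a) = w_a . s (an assumed toy form)."""
    return float(weights[action] @ state)

# Toy source task: 2 actions, 2 state variables (values are made up).
source_weights = np.array([[0.5, -1.0],
                           [2.0, 0.25]])

# Toy target task: 3 actions, 3 state variables. The extra action and
# variable are mapped by hand onto their closest source counterparts.
action_map = {0: 0, 1: 1, 2: 1}
state_var_map = {0: 0, 1: 1, 2: 1}

target_weights = transfer_action_values(source_weights, action_map, state_var_map)
```

The target learner would then start from `target_weights` rather than from zeros and keep updating them with its usual TD rule. How well this works depends on how closely the handcoded mappings reflect the real relationship between the two tasks.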

Similar resources

Transfer Learning via Multiple Inter-task Mappings

In this paper we investigate using multiple mappings for transfer learning in reinforcement learning tasks. We propose two different transfer learning algorithms that are able to manipulate multiple inter-task mappings for both model-learning and model-free reinforcement learning algorithms. Both algorithms incorporate mechanisms to select the appropriate mappings, helping to avoid the phenomen...

Transfer Learning for Policy Search Methods

An ambitious goal of transfer learning is to learn a task faster after training on a different, but related, task. In this paper we extend a previously successful temporal difference (Sutton & Barto, 1998) approach to transfer in reinforcement learning (Sutton & Barto, 1998) tasks to work with policy search. In particular, we show how to construct a mapping to translate a population of policies...

Automatically Mapped Transfer between Reinforcement Learning Tasks via Three-Way Restricted Boltzmann Machines

Reinforcement learning applications are hampered by the tabula rasa approach taken by existing techniques. Transfer for reinforcement learning tackles this problem by enabling the reuse of previously learned results, but requires an inter-task mapping to encode how the previously learned task and the new task are related. This paper presents an autonomous framework for learning inter-task mappi...

Transferring Evolved Reservoir Features in Reinforcement Learning Tasks

The major goal of transfer learning is to transfer knowledge acquired on a source task in order to facilitate learning on another, different, but usually related, target task. In this paper, we are using neuroevolution to evolve echo state networks on the source task and transfer the best performing reservoirs to be used as initial population on the target task. The idea is that any non-linear,...

Journal:
  • Journal of Machine Learning Research

Volume 8

Publication date: 2007